Beta-actin gene and regulatory elements, preparation and use.
DNA sequences are provided for production of β-actin, or
untranslated regions of β-actin genes may be employed in
conjunction with genes encoding polypeptides for efficient
expression in mammalian hosts. Particularly, the
transcriptional and translational initiation and termination
regions may be employed, by themselves or in combination
with intron sequences, for expression of various polypeptides
in mammalian host cells.
BETA-ACTIN GENE AND REGULATORY ELEMENTS,
PREPARATION AND USE

BACKGROUND OF THE INVENTION 
Field of the Invention 

Expression in mammalian hosts offers many 
opportunities for the production of mammalian proteins, 
not available to unicellular microorganism hosts. By 
employing mammalian hosts, one can produce polypeptides 
which are properly processed, so as to be identical in 
composition to the native or wild-type protein, includ- 
ing glycosylation, methylation, methionine removal, 
N-terminal acetylation or formylation, and the like. 
Furthermore, there may be substantial efficiencies in 
translation, with concomitant reduction in mutation. 

There is also a significant interest in natu- 
rally occurring proteins or alleles or mutants thereof, 
not only for use in research and therapy, but also for 
commercial purposes, where such polypeptides or proteins 
may serve in a variety of applications, such as polymeric 
units, additives, modifiers, bulking agents, or the 
like. In many situations it will be desirable that a 
mature polypeptide or protein is obtained, so that the 
final product has physical and chemical characteristics 
associated with the natural product. 

It is therefore of interest to develop a port- 
folio of regulatory sequences which can be used in the 
transcription and translation of naturally occurring 
polypeptides and proteins including alleles, as well as 
mutants thereof or totally synthetic polypeptides and 
proteins based on modifications of naturally occurring 
analogs . 

Furthermore, the protein β-actin serves a
variety of structural purposes in the cell. The protein
is particularly interesting for its ability to provide
fibrous and film structures which can find commercial
use as membranes, fibers, and the like.
Description of the Prior Art 

Seed, Nuc. Acids Res. (1983) 11:2427-2446, describes
a method for selecting genomic clones by homologous
recombination. The nucleotide sequence for the
mRNA derived from a β-actin cDNA clone is reported by
Ponte et al., ibid (1984) 12:1687-1696. Vandekerckhove,
Cell (1980) 22:893-899, reports coexpression of a mutant
β-actin with two normal β-actins in a stably transformed
human cell line. Ponte et al., Mol. Cell Biol. (1983)
3:1783-1791, report the presence of a large multi-pseudogene
subfamily for β-actin. Ponte et al., ibid, also
report the 3'-untranslated regions of β-actin as
isotype-specific. Nudel et al., Nucleic Acids Res. (1983)
11:1759-1771, predicted four intron sequences within
the coding region of β-actin.



SUMMARY OF THE INVENTION 
β-actin gene alleles including flanking DNA
regulatory regions and introns are provided for expression
of β-actin, as well as a source of regulatory DNA
sequences including introns for use with other genes
for expression in mammalian hosts. The 5'-untranslated
region can be used as a transcriptional and translational
region in combination with structural genes, where the
structural gene may be modified by insertion of one or
more introns for efficient processing of the initial
transcription product to produce a mature messenger
RNA. A homologous recombination technique is employed
for isolation of complete β-actin genes capable of
expression of β-actin in a mammalian host.

BRIEF DESCRIPTION OF THE FIGURES 
Fig. 1 is a diagrammatic depiction of plasmid
πAN7β1; and
Fig. 2 is a restriction endonuclease map and
structure of the human β-actin gene M1(β1)-2.
DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Polynucleotide sequences, combinations of the
polynucleotide sequences, self-replicating constructs,
host cells containing the constructs and methods employing
the various polynucleotide compositions are provided
for the expression of β-actin or other polypeptides in
mammalian, particularly primate, cells. The sequences
include the chromosomal gene for one or more allelic
β-actins, including the flanking regions for the structural
gene having transcriptional regulatory and translational
initiation and termination sequences, coding
sequences, intron sequences and cDNA encoding one
or more β-actin polypeptides. The sequences can be
employed for expression of β-actin or fragments thereof,
particularly fragments involving individual or combined
exons, either combined as to β-actin exons or exons
expressing other polypeptides. Also, the sequences may
find use as probes for the determination of the presence
of exons, introns, or flanking regions associated with
β-actin in a mammalian, particularly primate, cell or
other genes having homologous or partially homologous
sequences.

A β-actin chromosomal DNA sequence including
5'- and 3'-flanking regions, introns and exons from a
particular fetal source is set forth in the Experimental
section. β-actins from other human sources will generally
have at least 93 number percent of the same amino
acids, usually at least 98 number percent, demonstrating
substantial homology between the different β-actins.
The β-actin structural gene including exons and introns
will generally be about 3500 to 3600, more usually about
3550 nucleotides, inclusive of intron I, which is upstream
from the initiation codon and lies between the
initiation codon and the TATA box. The complete cDNA
sequence coding for β-actin will generally be of from
about 2025 to 2125 nucleotides. The TATA box will generally
be about 920 to 960, more usually about 940 nucleotides
from the initiation codon. In the sequence in the
Experimental section, the TATA box begins at -28 and
terminates at -22, while the initiation codon begins at
916.

Intron I is subject to polymorphisms associated
with different β-actin alleles. Intron I is indicated
as beginning at nucleotide 79 and terminating at
nucleotide 909. The polymorphic region is in the region of
about 103 to 118 as numbered in the sequence. This region
may be varied widely: the sequence indicated
has 16 base pairs (bp), while other sequences may have up to
34bp or more. The 5'-flanking region of β-actin may
begin with the nucleotide designated as -28 in the
sequence or be extended farther upstream, so that the TATA
box could be at a position 500, or even 3500 or more,
base pairs downstream from a restriction site in the
chromosomal fragment, so as to provide for a greater
non-transcribed region.
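
The intron I coordinates quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming the quoted positions are inclusive; the function names are hypothetical and serve only as illustration.

```python
# Coordinates as numbered in the Experimental sequence (assumed inclusive).
INTRON_I = (79, 909)
POLYMORPHIC_REGION = (103, 118)


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of an inclusive coordinate range."""
    return end - start + 1


def intron_length_with_variant(variant_bp: int) -> int:
    """Intron I length if the polymorphic region were variant_bp long
    instead of the length found in the reference sequence."""
    reference_poly = span_length(*POLYMORPHIC_REGION)
    return span_length(*INTRON_I) - reference_poly + variant_bp


intron_len = span_length(*INTRON_I)           # length of intron I
poly_len = span_length(*POLYMORPHIC_REGION)   # 16, matching the 16 bp in the text
longer_allele = intron_length_with_variant(34)

print(intron_len, poly_len, longer_allele)
```

Under these assumptions the polymorphic region comes out at 16 bp, in agreement with the text, and an allele carrying a 34 bp variant would have an intron I that is 18 bp longer.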

Alternatively, the TATA box may be only 25 to
50bp downstream from the initial nucleotide of the naturally
occurring nucleotides present in the chromosomal
sequence. Conveniently, all or a portion of intron I
may be removed, desirably retaining the termini of intron
I, where at least a portion of intron I is retained.
Thus, one would wish to retain the splicing donor and
acceptor sequences of intron I as well as at least one,
preferably at least two, of the nucleotides flanking
the intron, in order to favor accurate splicing. In
this manner, transcriptional initiation and processing
of the resulting messenger RNA may be efficiently achieved
with DNA sequences coding for other than β-actin.
Desirably, the DNA sequence from the terminus of intron
I to the initiation codon can also be retained, so that
any foreign DNA joined to that sequence would be joined
to all or substantially all of the DNA upstream from
the initiation codon of β-actin. Also, the 5'-sequence
may extend into the coding region, usually not past the
twelfth nucleotide, more usually not past the ninth
nucleotide.

In some instances it may also be desirable to
employ introns II, III, IV and/or V in a construction with
a structural gene other than β-actin. In these situations,
it would be desirable that the nucleotides immediately
adjacent to the termini of the introns, which
are part of the structural gene coding for the foreign
protein, be the same nucleotides, at least to the extent
of one or two nucleotides, or be a transition, rather
than a transversion, replacing a purine or pyrimidine
with a purine or pyrimidine respectively. This may
provide for enhanced accuracy in splicing. Any modification
of the introns should preserve the GT and AG
donor and acceptor splicing signals of the intron.
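
The two conditions in the preceding paragraph can be stated as simple checks. The sketch below is illustrative only and uses hypothetical helper names. It classifies a base substitution as a transition (purine for purine, or pyrimidine for pyrimidine) and tests whether an intron written 5' to 3' on the coding strand keeps the canonical GT donor at its start and AG acceptor at its end.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(original: str, replacement: str) -> bool:
    """True if the substitution stays within purines or within pyrimidines."""
    a, b = original.upper(), replacement.upper()
    if a not in PURINES | PYRIMIDINES or b not in PURINES | PYRIMIDINES:
        raise ValueError("bases must be one of A, C, G, T")
    if a == b:
        return False  # no change is neither a transition nor a transversion
    return {a, b} <= PURINES or {a, b} <= PYRIMIDINES


def has_canonical_splice_signals(intron: str) -> bool:
    """True if the intron begins with GT (donor) and ends with AG (acceptor)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Toy sequences, not taken from the β-actin gene.
print(is_transition("G", "A"))                      # purine to purine
print(is_transition("A", "C"))                      # purine to pyrimidine
print(has_canonical_splice_signals("GTAAGTCCCAG"))
```

A modified intron that fails the second check would not be expected to splice accurately, which is why any trimming or mutagenesis is confined to the interior of the intron.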

Any structure involving a foreign protein and 
one or more β-actin introns would involve fragmenting
the structural gene encoding for the foreign protein, 
desirably of fragments of at least about 20 nucleotides, 
preferably of at least about 50 nucleotides, where the 
fragments can be conveniently ligated to the one or 
more introns. Conveniently, adapters may be used having 
appropriate termini, either cohesive or blunt, where 
the adapters may extend into the intron and/or exon. 

The intron may be prepared by cloning the
sequences, having derived them from β-actin genes,
employing restriction enzyme digestion, exonuclease digestion,
or the like, combinations of naturally occurring
DNA sequences ligated to synthetic sequences, or combinations
thereof. It may be desirable in some instances
to mutagenize one or more nucleotides internal to an
intron, so as to provide for a convenient restriction
site, where relatively short adapters, generally from
about 20 to 100 nucleotides, may be prepared which can
be used to join the intron to the exon to provide for
splicing of two exons in proper reading frame. Alternatively,
portions of the intron may be removed, for example,
10-90 percent of the base pairs, so long as the
intron retains its capability of being excised in an
appropriate host, e.g., mammalian, particularly mouse
or primate.

Conveniently, the 3'-untranslated region of
a β-actin gene may be employed for transcriptional and
translational signals, particularly translational,
since the structural gene will normally include one or
more stop codons in reading frame with the mRNA coding
sequence. Usually, the 3'-region will be at least
100bp, more usually at least 200bp, and may be 650bp or
more depending upon the particular construction.

Expression of β-actin or foreign protein involving
one or more introns may be achieved in a variety
of ways in mammalian host cells. The coding construction
involving the β-actin transcriptional initiation
region, introns as appropriate and the structural gene
present as a contiguous entity or as exons separated by
one or more of the β-actin introns may be joined to an
appropriate vector. By a vector is intended a replication
system recognized by the intended host, where usually
there are present one or more markers to ensure the
stable maintenance of the DNA construct in the host.

Various replication systems include viral
replication systems, such as retroviruses, simian virus,
bovine papilloma virus, or the like. Alternatively, one
may combine the DNA construct with a gene which allows
for selection in a host. This gene can complement an
auxotrophic host or provide protection from a biocide.
Illustrative genes include thymidine kinase, dihydrofolate
reductase, which provides protection from methotrexate,
or the like. For the most part, markers will provide
resistance to a biocide, e.g., G418, methotrexate,
etc.; resistance to a heavy metal, e.g., copper;
prototrophy to an auxotroph; or the like. Genes which find
use include thymidine kinase, dihydrofolate reductase,
metallothionein, and the like. In addition, the subject
gene may be joined to an amplifiable gene, so that multiple
copies of the structural gene of interest may be made.
Depending upon the particular system, the gene may be
maintained on an extrachromosomal element or be
integrated into the host genome.

The foreign gene may come from a wide variety 
of sources, such as prokaryotes, eukaryotes, pathogens, 
fungi, plants, mammals, including primates, particularly 
humans, or the like. These proteins may include hormones, 

lymphokines, enzymes, capsid proteins, membrane proteins,
structural proteins, growth factors and inhibitors,
blood proteins, immunoglobulins, etc. The manner in
which an individual DNA sequence coding for a protein
of interest is obtained, divided into individual exons,
and joined to the one or more introns and transcriptional
and translational regulatory signals of β-actin will
depend upon each individual polypeptide of interest, as 
well as the information available concerning the DNA 
sequence coding for such polypeptide. 

The β-actin promoter or transcription system
including the promoter may be used for the regulation
of expression of other genes by regulating transcription
of mRNA complementary to another mRNA or portion
thereof. In effect, the β-actin promoter would regulate
transcription of the nonsense strand or portion
thereof of the gene whose expression is to be inhibited.
Such inhibition may find use in making an auxotrophic
host, inhibiting one pathway in favor of another metabolic
pathway, reversing or enhancing oncogenic characteristics
of a cell, or the like.
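
The relationship between a target mRNA and the complementary transcript described above can be sketched as a reverse complement. The helper names below are hypothetical, and the short sequence is a toy example rather than any part of a real gene.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA strand written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def complementary_rna(coding_strand: str) -> str:
    """RNA complementary to the mRNA of the given coding strand.

    The mRNA has the same sequence as the coding strand (with U for T),
    so the complementary RNA is the reverse complement with U for T.
    """
    return reverse_complement(coding_strand.upper()).replace("T", "U")


toy_coding = "ATGGAT"
print(complementary_rna(toy_coding))
```

In a construct of the kind described, this corresponds to placing the target gene, or a portion of it, in inverted orientation downstream of the promoter.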

Introduction of the DNA into the host will
vary depending upon the particular construction. Introduction
can be achieved by transfection, transformation,
transduction, or the like, as amply described in the
scientific literature. The host cells will normally be
immortalized cells, that is, cells that can be continuously
passaged in culture. For the most part, these cells
will be neoplastic and may be any convenient mammalian
cell which is able to express the desired polypeptide
and, where necessary or desirable, process the polypeptide
so as to provide a mature polypeptide. By processing
is intended glycosylation, methylation, terminal
acylation, e.g., formylation or acetylation, cleavage,
or the like. In some instances it may be desirable to
provide a leader sequence providing for secretion or
directing the product to a particular locus in the cell.
For secretion, the host should be able to recognize the
leader sequence and the processing signal for peptidase
cleavage and removal of the leader.

The isolation, cloning and verification of
a functional β-actin gene are complicated by the
existence of numerous pseudogenes. Thus, strategies
must be designed which ensure that the sequence obtained
is a functional β-actin gene. Furthermore, by having a
functional β-actin gene one can employ either untranslated
or translated sequences as probes for determining
the presence of other β-actin genes in a mammalian cell.
The subject strategy for isolating and verifying the
cloning of a β-actin gene included selecting genomic
clones by homologous recombination.

The method employs a miniplasmid into which
is inserted a fragment of either the untranslated region
or translated region of a β-actin gene. Such a fragment
may be obtained by isolation of a portion of the messenger
RNA for β-actin. In the subject strategy, the fragment
employed was from the 3'-untranslated region. The
idea was that homologous recombination would occur with
the greatest frequency with those sequences carrying
the β-actin gene and having the highest degree of homology
with the fragment present in the miniplasmid.

The recombination screen is conveniently carried
out with a phage library as described by Seed,
supra. A host is selected which is recombination proficient
and in which the viral vector of the library is
unable to propagate. Therefore, only those viruses
which undergo recombination with the miniplasmid will
survive and can be isolated. Where the miniplasmid has
a unique restriction site, and the same recognition
sequence exists in the β-actin gene, it is feasible to
screen fragments resulting from digestion of the recombinant
phage to detect the presence of a fragment having
the correct size. In this manner, pseudogenes may be
distinguished from true genes.

Demonstration of β-actin alleles or mutants
can be achieved by employing two different phage vectors,
where each of the vectors has substantially different
size packaging requirements, so that groups of fragments
are separated by size. These hybrid phages are then
combined with the miniplasmid containing the appropriate
β-actin gene fragment for homologous recombination in
an appropriate host. Those phage that propagate are
then screened with an appropriate probe. It is found
that the phage which includes fragments in the range of
about 10 to 23kb provides a number of clones which include
the complete β-actin gene, while the phage which
includes fragments of 2 up to 13kb of genomic DNA is found
not to have clones with a complete β-actin gene, but
rather clones that appear to be pseudogenes.

The β-actin produced by recombinant DNA can be
used in a variety of ways. The protein is fibrous and
can be used to make fibers or other structures. Furthermore,
based on the differences between β- and γ-actins,
one can modify the β-actin to change its structural
properties. Thus, a variety of β-actins having
different chemical and physical properties can be produced
which can be used by themselves or in combination
with other polyamides for the production of a wide variety
of articles, such as fibers, films, formed objects,
or the like. These pure fiber subunits will be synthesized
in pro- and eukaryotes.
The DNA sequences which are provided can be
used as probes, being used to detect mutational defects
in β-actin and relating the mutational defects to
cytoskeletal dysfunction as well as altered cellular
phenotype.

The following examples are offered by way of 
illustration and not by way of limitation. 

EXPERIMENTAL 
Materials and Methods 

General Methods . 

Growth and transformation of E. coli, colony
hybridization (Grunstein and Hogness, Proc. Natl. Acad.
Sci. USA (1975) 72:3961-3965), and purification of plasmid
DNA followed standard protocols as described previously
(Childs et al., Dev. Biol. (1979) 73:153-173).
Preparation of Charon 4A and λgtWES phage recombinant
DNA, agarose gels and hybridization blots, and the
conditions used for hybridization were as described previously
(Ponte et al., Mol. Cell Biol. (1983) 3:1783-1791).
Genomic DNA preparation from mammalian cells, DNA digestion
with restriction enzymes, and hybridizations performed
on nitrocellulose blots with dextran sulfate
present were conducted as described by Ponte et al.,
Nature (1981) 291:594-596. The human cell strains were
grown and maintained as previously described (Leavitt
and Kakunaga, J. Biol. Chem. (1980) 255:1650-1661).

Construction of the KD, HuT-14, and HuT-14T Human Gene 
Libraries . 

Purified λ Charon 4A (Blattner et al., Science
(1977) 196:161-169) vector DNA (EcoRI arms), λgtWES·λB
(Leder et al., Science (1977) 196:175-178) vector DNA
(full length phage genome) and packaging extracts prepared
from E. coli strains BHB2688 and BHB2690 were
purchased from Amersham (Arlington Heights, IL). Fully
or partially EcoRI digested fragments from genomic DNA,
2kb to 14kb or 10kb to 23kb, were purified from 5,5%
agarose gels [Seakem HGT(P)] by adsorption to glass
powder (Vogelstein and Gillespie, Proc. Natl. Acad.
Sci. USA (1979) 76:615-619). Two to 14kb EcoRI DNA
fragments were ligated to λgtWES DNA arms that were
generated by EcoRI and SacI digestion of λgtWES·λB DNA.
Ten kb, or 12kb to 23kb, DNA EcoRI fragments (full or
partial digests) were ligated into λ Charon 4A EcoRI
arms. The ligation reaction consisted of 1 part human
insert DNA and 3 parts vector DNA, 66mM Tris-HCl pH 7.4,
5mM MgCl2, 1mM ATP, 5mM dithiothreitol, 100µg bovine
serum albumin (Fraction 5), and T4 ligase. Ligation
reactions (13°C overnight) were always tested for completion
by agarose gel analysis of reaction aliquots
taken at the beginning and end of the ligation reaction.
Four µl of the ligation reaction products were mixed
with the two packaging extracts and phage assembly was
allowed to proceed for two hours at room temperature. Packaging
reactions were then diluted with 0.5ml of phage dilution
buffer (10mM Tris-HCl pH 7.4, 10mM MgSO4, and 0.01%
gelatin) followed immediately by 10µl of chloroform and
storage at 4°C. Packaging titers were determined by
infection of E. coli LE392.



Construction of the πAN7β1 Miniplasmid.

A 600bp EcoRI to BamHI fragment of the cDNA
(β-actin 3'UTR sequence) insert in pHFβA-3'UT (Ponte et
al. (1983), supra) was purified by gel electrophoresis
and adsorption to glass powder and then ligated to the
EcoRI to BamHI large fragment (alkaline phosphatase
treated) of plasmid πAN7. (πAN7 is a derivative of πVX (Seed
(1983), supra) which contains the tyrosinyl suppressor
tRNA gene (SupF) and a polylinker with eight restriction
sites. Also, the colicin E1 replicon is present; see
Fig. 1. The 600bp 3'-untranslated sequence (3'UTR)
EcoRI-BamHI fragment is inserted into the restriction
sites of the polylinker.) E. coli W3110(p3) (need citation)
was transformed with the ligation mixture and
plasmid DNA from individual ampicillin-resistant (Amp^r)
and tetracycline-resistant (Tet^r) colonies was amplified.
The structure of πAN7β1 (the 3'-UTR sequence is oriented
so that the SalI site in the miniplasmid is placed near
the junction between the 3'-terminus of the 3'-UTR and
the miniplasmid) was confirmed by restriction analysis
and DNA blotting experiments.

Selection of πAN7β1 Recombinant λ Phage.

A recombination screen (Seed (1983), supra;
DiMaio et al., Mol. Cell Biol. (1984) 4:340-350) to
isolate phage containing DNA homologous to the 3'UTR
sequence in πAN7β1 from a highly amplified gene library
(Maniatis et al., Cell (1978) 15:687-701) was performed.
The library was prepared by ligation of partial EcoRI
digests of DNA derived from a human fetus to the Charon
4A vector. Phage stocks were prepared by infecting
bacteria carrying πAN7β1 with 10^6 PFU of the Charon 4A
library. Phage able to form plaques on W3110(Su-) bacteria
were present in the lysate at frequencies between
10^-7 and 10^-9. See Table 1.

The presence of actin coding sequences as
well as the 3'UTR and plasmid vector sequence in these
rare clones was confirmed by blotting experiments on
Southern transfers of restriction endonuclease-digested
DNA isolated after propagation of phage from individual
plaques.

Recombination screens were then performed as
above on unamplified phage in packaging reactions that
were generated by ligation of EcoRI digested HuT-14 and
HuT-14T DNA to the λgtWES vector arms (Leder et
al. (1977), supra) and phage packaging reactions that
were generated by ligation of EcoRI digested KD, HuT-14
and HuT-14T DNA (cell line sources) to the Charon
4A vector arms. Frequencies of recovery of library
phage clones by recombination selection that contain
the β-actin gene are presented in Table 2.

A recombination was performed in which 10^6
PFU of library phage were amplified by infection in the
recombination proficient E. coli strain WoP3πAN7β1. Lytic
progeny phage from the amplification were used to infect
a host strain (WoP3SupO) in which Charon 4A phage do
not propagate, so that no lytic plaques are produced in
the absence of recombination. Infection of the host
produced plaques at a consistent frequency between 10 7 - 
10 of its true titer. All phage that were isolated 
contained actin coding sequences and had undergone recombination
with the πAN7β1 plasmid.

Five distinct phage clones were selected as
set forth in Table 1, with the sizes of the EcoRI fragments
containing coding or non-coding 3'-UTR sequences
indicated. In the recombination trial, 50 of the 51
plaques isolated were identical and designated M1(β1)-1.
In addition to three EcoRI fragments that contained
actin coding sequences (5.0kb, 1.4kb, 1.5kb), one additional
EcoRI fragment (3.5kb) which lacked an actin
coding sequence was common to all 50 isolates. A single
additional plaque (M1(β1)-2) contained a different phage
with a different set of EcoRI fragments: three fragments
contained actin sequences (6.6kb, 7.1kb and 1.5kb) and
two fragments lacked actin sequences (2.0kb and 1.2kb).

A second recombination trial produced three
additional and still different recombinant clones (Table
1). The recovery of different plaque types during independent
trials was interpreted as being a result of the
skewed nature of the human lambda library as well as
the degree of sequence similarity between the πAN7 β-actin
insert and the various genomic β-actin sequences.

M1(β1)-2 was distinguished from the other
isolates in that it hybridized to a probe that contained
the 5'-actin coding sequence (codons 1-98). SalI digestion
of M1(β1)-2 generates a 2500bp fragment that contains
most of the coding sequences for β-actin plus the
3'UTR sequence. The nucleotide sequence of the fragment
was determined, which confirmed the position of the
SalI site at codon 10 and the existence of four intron
regions, the sum of whose lengths is 731bp. Furthermore,
the nucleotide sequence of the coding regions of
M1(β1)-2 was shown to be identical to the β-actin cDNA
sequence. Restriction mapping of lambda clone M1(β1)-2
demonstrated the presence of the β-actin sequence on a
12.2kb genomic fragment which was divided into two EcoRI
fragments of 6.6 and 7.1kb by πAN7β1 recombination.

Size fractionated EcoRI fragments ranging
from 10 to 12kb and larger from HuT-14 and HuT-14T DNA
were used to prepare recombinant phage. See Table 2.
Amplification aliquots (10^4 packaging events) were first
screened by πAN7β1 recombinant selection to determine
which library aliquots contained any β-actin genes or
pseudogenes. Those library aliquots that contained β-actin
3'UTR sequences were rescreened by conventional
in situ plaque hybridization to select clones that hybridized
to the 3'UTR probe. Following purification,
each β-actin clone was recombined with πAN7β1 and the
recombinant forms examined by EcoRI and SalI restriction
endonuclease digestion and the resulting DNA fragments
hybridized with intron I, 3'UTR and coding probes to
fully assess their identity and relatedness. Table 3
summarizes the characteristics of each clone that was
isolated in this way.
In total, eight of ten isolates from HuT-14T
DNA and five of eight isolates from HuT-14 DNA contained
a β-actin gene similar to that found in M1(β1)-2,
each of these separate clones hybridizing strongly to
the intron probe. In addition, the πAN7β1 recombinants
contained the characteristic 2.5kb SalI restriction
endonuclease fragment carrying the β-actin coding, intron
and 3'UTR sequences. The size of the uninterrupted
genomic fragment for these clones was about 13.8kb.

The EcoRI restriction endonuclease fragment
carrying the β-actin gene, including its introns, in
the πAN7β1 KD, HuT-14 and HuT-14T recombinants is 8.2kb
long (Table 3). By contrast, EcoRI fragments bearing
the β-actin gene in M1(β1)-2, derived from the human
fetal DNA library, appear to be only 6.6kb long.

To determine whether the difference in fragment
lengths was due to a restriction site polymorphism
or represented paralogous alleles, EcoRI digestion fragments
of three of the πAN7β1 recombinant β-actin
clones from HuT-14 DNA (14β-27(β1), 14β-29(β1), and
14β-30(β1)) and the fetal gene clone M1(β1)-2 were subcloned
into pBR322. These subclones were digested with
EcoRI and the resulting fragments separated by agarose
gel electrophoresis. The blots were first hybridized
to the β-actin intron I probe and then the same blot was
hybridized with the β-actin 3'UTR probe. The intron
probe hybridized to the 8.2kb EcoRI fragment of 14β-27(β1),
14β-29(β1), and 14β-30(β1) and the 6.6kb EcoRI
fragment of M1(β1)-2. By contrast, the 3'UTR probe
hybridized to the 7.1kb EcoRI DNA fragment, common to
all four clones, as well as to the 8.2kb or 6.6kb EcoRI
fragments containing the intron I sequences. This result
indicates that the genes isolated from HuT-14 and
HuT-14T DNA differ from the fetus-derived gene in
M1(β1)-2 in the location of an EcoRI site in the genomic
DNA flanking the 5' region of the β-actin gene. All 13
independent πAN7β1 recombinant clones derived from both
HuT libraries and one additional clone derived from the
KD cell DNA library have an identical arrangement with
regard to the positions of flanking EcoRI sites. The
uninterrupted EcoRI fragment in the corresponding
non-πAN7 recombinant clones is 13.8kb, from which it is
concluded that the β-actin gene probably resides on a
13.8kb genomic EcoRI fragment.
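
The EcoRI fragment sizes reported above can be cross-checked with simple arithmetic. In a miniplasmid recombinant the gene-bearing genomic fragment appears as two EcoRI fragments, so the sum of the two, less the length of the uninterrupted fragment, gives the extra DNA present in the recombinant. The sketch below is illustrative only; the function name is hypothetical, and attributing the surplus to the integrated miniplasmid is an assumption rather than a figure stated in the text.

```python
def extra_length_kb(recombinant_fragments_kb, uninterrupted_kb):
    """Surplus DNA (kb) in a recombinant relative to the uninterrupted fragment.

    Sizes are converted to whole tenths of a kb before subtracting so that
    floating point rounding does not disturb one-decimal gel estimates.
    """
    total_tenths = sum(round(size * 10) for size in recombinant_fragments_kb)
    return (total_tenths - round(uninterrupted_kb * 10)) / 10


# Fetal library clone: 6.6kb and 7.1kb fragments from a 12.2kb genomic fragment.
fetal_extra = extra_length_kb((6.6, 7.1), 12.2)

# HuT-14 / HuT-14T clones: 8.2kb and 7.1kb fragments from a 13.8kb fragment.
hut_extra = extra_length_kb((8.2, 7.1), 13.8)

print(fetal_extra, hut_extra)
```

Both sets of figures leave the same surplus, which is what would be expected if the same miniplasmid sequence had integrated in each case and the two gene copies differed only in the position of the 5'-flanking EcoRI site.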

The sequences derived from the gene in Ml (pi) -2 
and from a cDNA clone (Ponte et al . , Nuc. Acids Res. 

10 (1984) 12:1687-1696) show that codons 243, 244, and 245 
(-Asp-Gly-Gln-) were encoded by GAC GGC CAA. Since the 
first p-actin mutation of HuT-14 resulted in an exchange 
of the glycine (codon 244) for an aspartic acid residue, 
the predicted sequence for codon 244 after the mutation ^ 

15 is GAC. The unmutated sequence GGCC (codons 244 and 

245) is a restriction site for the endonuclease Haelll, 
a site which should be absent in mutant copies of the 
gene from HuT-14 and HuT-14T. BstE II sites flank the 
mutation site and cleave between the codon 158 and 159 

20 and at a site 38bp into intron IV respectively. This 
BstE II fragment (366bp) was isolated from the DNA of 
three plasmid subclones of the HuT-14 7tAN7pi derived p- 
actin genes (the 8.2kb EcoR I fragment from 14p-27(pl), 
14p-29(pi), and 14p-30(pi) and three additional plasmid 

25 subclones from non-nAN7 derived HuT-14T p-actin genes 
(the 13. 8Kb EcoRI fragment from 14TP-17, 14Tp-21 and 
14Tp-24). Within this BstE II fragment there are Haelll 
sites at codons 182, 203, 204, 228 and 244, the site of 
the mutation (Fig. 2). Digestion of the BstE II fragment 

from the wild-type β-actin gene with HaeIII generates
five restriction fragments of 71, 65, 72, 52 and 106bp,
respectively, whereas the mutated gene missing the
HaeIII site at codon 244 should produce four restriction
fragments of 71, 65, 72 and 158bp. Four of six clones
from HuT-14 (clones 14β-27(β1) and 14β-29(β1)) and
HuT-14T (clones 14Tβ-21 and 14Tβ-24) exhibited the 158bp
HaeIII-BstEII fragment indicative of copies of the gene
mutated at codon 244. The two remaining clones,
14β-30(β1) and 14Tβ-17, exhibited the wild-type digestion
pattern indicative of the normal unmutated gene. Thus,
the β-actin genes cloned from the HuT-14 and HuT-14T DNA
libraries represent both the wild-type and mutant alleles.
Furthermore, the presence of the predicted mutation in
one of the alleles formally proves that these genes, and
not the other EcoRI β-actin coding fragments, are the
expressed β-actin genes in these human fibroblast strains.
The sequences of the genes carrying the mutations
confirm that these genes are expressed.
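
The restriction-site reasoning above can be checked with a short
sketch. The Python below is illustrative only and is not part of the
original disclosure: the helper names (has_haeiii_site,
fragments_after_site_loss) are hypothetical, and the fragment sizes
are simply those quoted in the text, assumed here to be listed in
order along the 366bp BstEII fragment.

```python
"""Sketch of the HaeIII diagnostic for the codon 244 mutation (illustrative only)."""

HAEIII_SITE = "GGCC"  # HaeIII recognition sequence

# Codons 243-245 as quoted in the text; the mutant replaces GGC (Gly) with GAC (Asp).
WILD_TYPE_CODONS = ["GAC", "GGC", "CAA"]
MUTANT_CODONS = ["GAC", "GAC", "CAA"]

# HaeIII fragment sizes (bp) quoted for the wild-type 366bp BstEII fragment.
# Assumption: they are listed in order along the fragment.
WILD_TYPE_FRAGMENTS = [71, 65, 72, 52, 106]


def has_haeiii_site(codons):
    """Return True if the concatenated codons contain a HaeIII site."""
    return HAEIII_SITE in "".join(codons)


def fragments_after_site_loss(fragments, lost_site_index):
    """Merge the two fragments that flank a lost restriction site.

    Internal sites are counted from 0, so site i lies between
    fragments[i] and fragments[i + 1].
    """
    merged = fragments[lost_site_index] + fragments[lost_site_index + 1]
    return fragments[:lost_site_index] + [merged] + fragments[lost_site_index + 2:]


if __name__ == "__main__":
    print(has_haeiii_site(WILD_TYPE_CODONS))   # True: GACGGCCAA contains GGCC
    print(has_haeiii_site(MUTANT_CODONS))      # False: GACGACCAA does not
    print(sum(WILD_TYPE_FRAGMENTS))            # 366, the BstEII fragment length
    # The codon 244 site is the last internal site (index 3).
    print(fragments_after_site_loss(WILD_TYPE_FRAGMENTS, 3))  # [71, 65, 72, 158]
```

Under these assumptions, loss of the last internal site fuses the 52bp
and 106bp fragments into the 158bp fragment reported for the mutant
clones.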

A β-actin expression vector providing the β-actin
promoter region, a polylinker and a polyadenylation
signal was constructed where the expression construct was
present on a vector having a bacterial origin of
replication, as well as a marker for selection in a
mammalian host.

A 4.3kb EcoRI-AluI fragment containing 3.4kb
of the DNA upstream of the CAP site plus 5'-untranslated
region plus IVS I terminating at the splice junction was
isolated such that the sequence terminates 6bp from the
initiation codon; this fragment was obtained from clone
14Tβ-17. Plasmid pSP64 (Melton et al., Nucl. Acids
Res. (1984) 12:7035-7056) was digested with BamHI, the
overhang filled in with the Klenow fragment, followed
by digestion with EcoRI and ligation to the EcoRI-AluI
β-actin fragment. The resulting plasmid was first
digested with HindIII, the HindIII site filled in with
the Klenow fragment, followed by digestion with EcoRI
to provide an EcoRI-flush HindIII fragment containing
the β-actin sequence.
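
The logic of the pSP64 step (one cohesive EcoRI end, one blunt end)
can be sketched as follows. This is a simplified model written for
illustration, not a description of the authors' procedure: the End
class and can_ligate function are hypothetical names, and only the
standard end types of the named enzymes are assumed (EcoRI, BamHI and
HindIII leave 5' overhangs; AluI cuts blunt; Klenow fill-in converts
a 5' overhang to a blunt end).

```python
"""Sketch: why a Klenow-filled BamHI end accepts the blunt AluI end (illustrative only)."""
from dataclasses import dataclass


@dataclass(frozen=True)
class End:
    enzyme: str
    overhang: str  # 5' overhang sequence; an empty string means a blunt end

    def filled_in(self):
        """Model Klenow fill-in: a 5' overhang becomes a blunt end."""
        return End(self.enzyme + " (filled in)", "")

    @property
    def blunt(self):
        return self.overhang == ""


# Standard end types for the enzymes named in the text.
ECORI = End("EcoRI", "AATT")
BAMHI = End("BamHI", "GATC")
HINDIII = End("HindIII", "AGCT")
ALUI = End("AluI", "")


def can_ligate(a, b):
    """Blunt ends join only blunt ends; cohesive ends need matching overhangs.

    The overhangs used here are all palindromic, so an equality test is
    enough for this sketch.
    """
    if a.blunt or b.blunt:
        return a.blunt and b.blunt
    return a.overhang == b.overhang


if __name__ == "__main__":
    vector_ends = (ECORI, BAMHI.filled_in())   # pSP64 after BamHI, Klenow, EcoRI
    insert_ends = (ECORI, ALUI)                # EcoRI-AluI beta-actin fragment
    print(all(can_ligate(v, i) for v, i in zip(vector_ends, insert_ends)))  # True
    print(can_ligate(BAMHI, ALUI))             # False without the fill-in
```

Because the two ends of the insert differ, the fragment can enter the
vector in only one orientation in this model.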

Plasmid pcDV1 (Okayama and Berg, Mol. Cell.
Biol. (1983) 3:280-289) was employed for the SV40
polyadenylation signal corresponding to a BamHI-BclI (map
positions 0.145 to 0.19) fragment. The SalI and AccI
sites were destroyed by sequentially digesting the
plasmid with the appropriate restriction enzyme, removing
the overhang with S1 nuclease and ligating the resulting
flush ends. The resulting plasmid was then digested
with XhoI, which is present proximal to the 5'-terminus
of the fragment containing the SV40 polyadenylation
signal, the XhoI site filled in, followed by digestion of
the linear fragments with EcoRI to provide an EcoRI-flush
XhoI fragment. This fragment was then ligated with the
EcoRI-flush HindIII fragment containing the β-actin
sequences. The resulting plasmid was digested with EcoRI
and ClaI to provide a linear fragment containing the
promoter region from β-actin, a polylinker sequence, and
the SV40 polyadenylation site.

Plasmids pSV2-neo (Southern and Berg, J. Mol.
Appl. Genet. (1982) 1:327-341) and pSV2-gpt (Mulligan
and Berg, Proc. Natl. Acad. Sci. USA (1981) 78:2072-2076)
were each sequentially digested with HindIII and BamHI,
followed by filling in the overhang with the Klenow
fragment and recircularizing. The resulting modified
plasmids were then digested with PvuII and EcoRI to
provide new fragments having the SV40 origin and SV40
promoter, and either the neomycin phosphotransferase
gene or the xanthine guanine phosphoribosyl transferase
gene, followed by the SV40 polyadenylation site.

The neo fragment and gpt fragments were inserted
into the ClaI-EcoRI fragment to provide expression
vectors which could be selected by G418 resistance
or resistance to aminopterin and mycophenolic acid,
respectively. The vectors were then ready for use for
insertion of a gene for expression in a mammalian host
under the regulatory control of the β-actin promoter
and for selection of recipient mammalian cells.
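
As a compact summary of the construction just described, the elements
of the two vectors can be written out as a small data structure. This
is a sketch for orientation only: SelectionCassette and vector_layout
are hypothetical names, and the list gives the elements named in the
text, not a verified physical map or element order of the plasmids.

```python
"""Sketch: elements of the beta-actin expression vectors (illustrative only)."""
from dataclasses import dataclass


@dataclass(frozen=True)
class SelectionCassette:
    marker_gene: str
    selective_agents: tuple  # agents named in the text for selection


NEO = SelectionCassette("neomycin phosphotransferase (neo)", ("G418",))
GPT = SelectionCassette(
    "xanthine guanine phosphoribosyl transferase (gpt)",
    ("aminopterin", "mycophenolic acid"),
)


def vector_layout(cassette):
    """List the elements named in the text for a vector carrying `cassette`.

    Assumption: the list order is for reading convenience and does not
    assert the order of elements on the plasmid.
    """
    return [
        "beta-actin promoter region (5' flank, 5'-untranslated region, IVS I)",
        "polylinker (insertion site for the gene to be expressed)",
        "SV40 polyadenylation signal",
        "SV40 origin and SV40 promoter",
        cassette.marker_gene,
        "SV40 polyadenylation site",
        "bacterial origin of replication",
    ]


if __name__ == "__main__":
    for cassette in (NEO, GPT):
        print(cassette.marker_gene, "->", ", ".join(cassette.selective_agents))
        for element in vector_layout(cassette):
            print("   ", element)
```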

The following represents the complete sequence
for the β-actin gene, including flanking regions,
which include the promoter region and the termination
region, as well as the introns, indicating the splicing
sites for the introns.

The sequence that codes for mRNA begins at
nucleotide position 1, the nucleotides being numbered
relative to the A of the cap site. The first intron
begins at about nucleotide 79 and ends at position 910,
and is followed by a six member nucleotide sequence
that codes for further 5' untranslated mRNA before
translation commences at nucleotide 917. Nucleotides
103 to 118 in intron I include a polymorphic region.
In the human fibroblast gene derived from clones 14β27
and 14Tβ24, this polymorphic region is replaced by the
sequence CAGGCGGCTCACGPCCCGPCCGGCAGGCPCCGGAC. For the
human fibroblast gene derived from clone 14Tβ21, the
polymorphic sequence is replaced by
CAGCGGCCAGCGCCGCAGGCCGCGGCCC. Also, a 30 base-pair
highly conserved, intervening sequence exists at bases
752 to 781. Where the exact identity of a base has not
been verified, P indicates a purine, Q refers to a
pyrimidine, and N refers to any nucleotide. The amino
acid sequence is numbered according to Lu and Elzinga,
Biochemistry (1977) 5801-5806.
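
The coordinates and ambiguity symbols above lend themselves to a
quick consistency check. The sketch below is illustrative and rests
on stated assumptions: coordinates are treated as 1-based and
inclusive, the "about nucleotide 79" intron start is taken at face
value, and the names used (AMBIGUITY, expand, VARIANT_14B27) are
hypothetical.

```python
"""Sketch: coordinate arithmetic and ambiguity symbols (illustrative only)."""
from itertools import product

# Coordinates quoted in the text, treated as 1-based and inclusive (an assumption).
INTRON_I_START = 79       # the text says "about nucleotide 79"
INTRON_I_END = 910
TRANSLATION_START = 917

exon_1_length = INTRON_I_START - 1                        # nucleotides 1..78
intron_i_length = INTRON_I_END - INTRON_I_START + 1       # nucleotides 79..910
utr_after_intron = TRANSLATION_START - INTRON_I_END - 1   # nucleotides 911..916
five_prime_utr_length = exon_1_length + utr_after_intron

# Ambiguity symbols as defined in the text: P = purine, Q = pyrimidine, N = any.
AMBIGUITY = {"P": "AG", "Q": "CT", "N": "ACGT"}


def expand(sequence):
    """List every concrete sequence consistent with the ambiguity symbols."""
    choices = [AMBIGUITY.get(base, base) for base in sequence]
    return ["".join(combo) for combo in product(*choices)]


# Replacement sequence quoted for clones 14-beta-27 and 14T-beta-24.
VARIANT_14B27 = "CAGGCGGCTCACGPCCCGPCCGGCAGGCPCCGGAC"

if __name__ == "__main__":
    print(utr_after_intron)            # 6, matching the six-nucleotide segment
    print(intron_i_length)             # 832
    print(five_prime_utr_length)       # 84
    print(len(expand(VARIANT_14B27)))  # 8, from three unresolved purines
```

Under these assumptions the six nucleotides between the end of intron
I and the initiation codon agree with the quoted start of translation
at nucleotide 917.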



It is evident from the above results that
DNA sequences are provided which can be used for
detecting polymorphisms, alleles and mutants of β-actin. In
addition, the fragments of the sequences can be obtained
by appropriately restricting the DNA, isolating
individual fragments, and using the fragments as regulatory
signals or introns. As indicated, DNA sequences from
various structural genes may be joined to one or more
introns, as well as the transcriptional regulatory
sequence for β-actin to provide for constitutive efficient
production of polypeptides of interest in appropriate
mammalian hosts.

Although the foregoing invention has been
described in some detail by way of illustration and
example for purposes of clarity of understanding, it
will be obvious that certain changes and modifications
may be practiced within the scope of the appended claims.

WHAT IS CLAIMED IS:



1. A genomic DNA sequence of less than 15kb
encoding for a human β-actin.

2. A DNA sequence according to Claim 1,
which is chromosomal and includes at least one intron.



3. A DNA sequence of less than about 1000kb
including the β-actin transcriptional initiation region.

4. A DNA sequence according to Claim 3
extending downstream not farther than the twelfth
nucleotide in the coding region.

5. A DNA sequence according to Claim 4,
having downstream from said transcriptional initiation,
intron I.



6. A DNA construct comprising a bacterial
replication system and a sequence coding for at least
one exon of a human β-actin.

7. A construct according to Claim 6,
including all of the exons of β-actin.

8. A construct according to Claim 6, wherein
said exons are separated by β-actin introns.

9. A DNA sequence coding for at least a
substantial proportion of intron I having a flanking
region adjacent a terminus of said intron I DNA sequence
in the downstream order of transcription coding for
other than β-actin.

10. A DNA sequence including introns I, II,
III, IV or V of β-actin or fragments thereof retaining
the splicing donor and acceptor terminal sequences,
each of said introns or fragments substantially free of
coding sequences of β-actin.

11. A DNA intron sequence according to Claim
10 flanked by expression sequences, which upon excision
of said DNA intron sequence have an open reading frame.

12. A DNA construct comprising a human β-actin
transcriptional and translational sequence joined at
its 3'-terminus to a DNA sequence coding for a
polypeptide other than β-actin either directly or through the
intermediary of β-actin intron I, wherein said coding
DNA sequence is joined at its 3'-terminus to a
transcriptional termination region, with the proviso that
said coding sequence may be interrupted by 0 to 4 β-actin
introns other than intron I, or fragments thereof capable
of excision in a mammalian host.

13. A mammalian cellular host including a 
DNA construct according to Claim 12. 

14. A host according to Claim 13, wherein
said host is a primate.

15. A method for obtaining a polypeptide
expression product which comprises:

growing a host according to Claim 13; and
isolating said polypeptide encoded for by
said coding sequence.